Support tool callbacks in MCP sampling #2998
Adds a parallel SamplingWithToolsHandler alongside the existing SamplingHandler so MCP servers can include a tools array in sampling/createMessage requests. The host drives its model with those tools and returns any tool_use blocks as ToolUseContent; the server remains responsible for executing the tool and continuing the loop in a follow-up sampling request. The initialize handshake now advertises sampling.tools capability, and the MCP toolset selects the appropriate go-sdk handler (basic vs. with-tools) based on which handler is registered.
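As a rough illustration of the result shape described above, the sketch below builds a reply from the model's aggregated output and picks the stop reason. The `ToolCall`, `Block`, and `Result` types and the `buildResult` function are hypothetical stand-ins invented for this example, not the go-sdk's or this PR's actual types.

```go
package main

import "fmt"

// ToolCall is a hypothetical stand-in for one tool call emitted by the host's model.
type ToolCall struct {
	ID        string
	Name      string
	Arguments string // raw JSON arguments
}

// Block is a hypothetical content block; Type is "text" or "tool_use".
type Block struct {
	Type string
	Text string
	Call *ToolCall
}

// Result is a hypothetical stand-in for the sampling result returned to the server.
type Result struct {
	Role       string
	Content    []Block
	StopReason string
}

// buildResult turns the model's aggregated text and tool calls into content
// blocks. The stop reason is "toolUse" only when at least one call is present.
func buildResult(text string, calls []ToolCall) Result {
	res := Result{Role: "assistant", StopReason: "endTurn"}
	if text != "" {
		res.Content = append(res.Content, Block{Type: "text", Text: text})
	}
	for i := range calls {
		res.Content = append(res.Content, Block{Type: "tool_use", Call: &calls[i]})
	}
	if len(calls) > 0 {
		res.StopReason = "toolUse"
	}
	return res
}

func main() {
	res := buildResult("", []ToolCall{{ID: "call-1", Name: "calculator", Arguments: `{"expression":"2+2"}`}})
	fmt.Println(res.StopReason, len(res.Content))
}
```

The host never executes the call in this sketch; it only reports it, which matches the division of labour where the server runs the tool.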
Mounts an in-process gomcp.NewServer on an httptest server via StreamableHTTPHandler. Its one tool, ask_with_calculator, runs a sampling loop: sends sampling/createMessage with a calculator tool, gets a tool_use back from the host LLM, "executes" the calculator, sends a follow-up sampling request carrying the tool_result, and returns the final text. The Gemini side is recorded once and replayed on subsequent runs, so the test runs offline in CI.
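The server-side loop that the test tool runs can be sketched roughly as follows. Everything here is an assumption made for illustration: `Sampler` stands in for sending sampling/createMessage to the host, `fakeHost` stands in for the recorded model, and the message types are simplified; none of it is the go-sdk API or the test's real code.

```go
package main

import (
	"errors"
	"fmt"
)

// Message is a simplified, hypothetical chat message.
type Message struct {
	Role string // "user" or "assistant"
	Kind string // "text", "tool_use" or "tool_result"
	Text string
}

// Reply is a hypothetical sampling result.
type Reply struct {
	StopReason string // "toolUse" or "endTurn"
	Message    Message
}

// Sampler is a hypothetical stand-in for one sampling/createMessage round trip.
type Sampler func(history []Message) (Reply, error)

// runLoop keeps sampling until the host stops asking for tools, executing
// each requested tool locally and feeding the result back in.
func runLoop(sample Sampler, execTool func(args string) string, question string, maxTurns int) (string, error) {
	history := []Message{{Role: "user", Kind: "text", Text: question}}
	for turn := 0; turn < maxTurns; turn++ {
		reply, err := sample(history)
		if err != nil {
			return "", err
		}
		if reply.StopReason != "toolUse" {
			return reply.Message.Text, nil
		}
		// Assumption: the tool result travels back on a user turn.
		result := Message{Role: "user", Kind: "tool_result", Text: execTool(reply.Message.Text)}
		history = append(history, reply.Message, result)
	}
	return "", errors.New("sampling loop did not finish within maxTurns")
}

// fakeHost asks for the calculator once, then answers with its result.
func fakeHost(history []Message) (Reply, error) {
	last := history[len(history)-1]
	if last.Kind == "tool_result" {
		msg := Message{Role: "assistant", Kind: "text", Text: "The answer is " + last.Text}
		return Reply{StopReason: "endTurn", Message: msg}, nil
	}
	msg := Message{Role: "assistant", Kind: "tool_use", Text: "2+2"}
	return Reply{StopReason: "toolUse", Message: msg}, nil
}

func main() {
	answer, err := runLoop(fakeHost, func(string) string { return "4" }, "What is 2+2?", 4)
	fmt.Println(answer, err)
}
```

The `maxTurns` bound is a design choice of this sketch so that a host that always requests tools cannot keep the loop running forever.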
/review
docker-agent
left a comment
Assessment: 🟡 NEEDS ATTENTION
Two medium-severity findings in the newly-added sampling-with-tools code. The core stream-aggregation logic, capability handshake, content building, and limits enforcement are all well-structured and correctly tested.
	PromptListChangedHandler: promptChanged,
}
switch {
case c.samplingWithToolsHandler != nil:
[MEDIUM] Handler selection baked at Initialize time — reconnect may silently lose sampling-with-tools capability
The SDK CreateMessageWithToolsHandler (vs CreateMessageHandler) is chosen once when Initialize runs by reading c.samplingWithToolsHandler at that moment:
switch {
case c.samplingWithToolsHandler != nil:
	opts.CreateMessageWithToolsHandler = c.handleSamplingWithToolsRequest
case c.samplingHandler != nil:
	opts.CreateMessageHandler = c.handleSamplingRequest
}
In the normal startup path this is fine: configureToolsetHandlers (line 221 of loop.go) calls SetSamplingWithToolsHandler before getTools (line 223) triggers Initialize, so the handler is present when the switch executes.
The concern is the reconnect path: if Initialize is invoked again (e.g., after a dropped connection) without configureToolsetHandlers being re-called first, the stored handler fields still hold the right value, so this should be fine in practice. However, if a future refactor re-orders the reconnect sequence or resets the handler fields, the MCP session would silently fall back to no handler — the server's sampling/createMessage requests with tools would receive no response and time out.
Consider using an accessor closure (captured at startup time) instead of reading the field at Initialize, or adding a comment to Initialize noting the ordering dependency so future maintainers know not to call it before the handlers are set.
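One way to picture the closure option is a dispatcher that resolves the handler on every request instead of copying it when Initialize runs. The sketch below is deliberately simplified and hypothetical: `client`, `options`, and `Handler` are invented for illustration, and it uses a single `CreateMessage` field, whereas the real SDK exposes two mutually exclusive fields, so it shows only the late-binding idea.

```go
package main

import "fmt"

// Handler is a hypothetical, simplified sampling handler.
type Handler func(req string) string

// client is a hypothetical stand-in for the MCP session client.
type client struct {
	samplingHandler          Handler
	samplingWithToolsHandler Handler
}

// options is a hypothetical stand-in for the SDK's client options.
type options struct {
	CreateMessage Handler
}

// initialize registers a dispatcher that reads the handler fields at request
// time, so the order of registration relative to initialize no longer matters.
func (c *client) initialize() options {
	return options{
		CreateMessage: func(req string) string {
			switch {
			case c.samplingWithToolsHandler != nil:
				return c.samplingWithToolsHandler(req)
			case c.samplingHandler != nil:
				return c.samplingHandler(req)
			default:
				// Fail loudly instead of leaving the server to time out.
				return "error: no sampling handler registered"
			}
		},
	}
}

func main() {
	c := &client{}
	opts := c.initialize() // handlers are not registered yet
	c.samplingWithToolsHandler = func(req string) string { return "with-tools: " + req }
	fmt.Println(opts.CreateMessage("ping"))
}
```

The trade-off in this sketch is that a missing handler surfaces as an explicit error on the first request rather than at setup time.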
Addressed in 9388968 with the lightweight option — extended the existing // Sampling: comment in both stdio.go and remote.go to call out the ordering dependency (SetSampling*Handler must run before Initialize, including on reconnect). Skipped the closure refactor since the handler fields are sticky on sessionClient and reconnect reuses the same client instance — happy to revisit if you'd prefer the stronger guarantee.
- Reject tool_use blocks under a non-assistant role explicitly rather than silently dropping the message. The MCP spec places tool_use on assistant turns, but a malformed server would previously have its message disappear from the converted chat history with no error.
- Document the ordering dependency between SetSampling*Handler and Initialize in both stdio and remote MCP client setups, so future maintainers don't reorder them and silently lose the sampling handler on reconnect.
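The explicit rejection in the first item could look roughly like the check below. This is a minimal sketch under stated assumptions: `Message`, `Block`, `validateToolUseRoles`, and `errToolUseRole` are hypothetical names, and the PR's real conversion code may be structured differently.

```go
package main

import (
	"errors"
	"fmt"
)

// Block is a hypothetical content block identified only by its type.
type Block struct {
	Type string // e.g. "text", "tool_use", "tool_result"
}

// Message is a hypothetical incoming sampling message.
type Message struct {
	Role    string
	Content []Block
}

// errToolUseRole is a hypothetical sentinel error for misplaced tool_use blocks.
var errToolUseRole = errors.New("tool_use content is only valid on assistant messages")

// validateToolUseRoles returns an error for the first tool_use block found on
// a non-assistant message, instead of letting the message be dropped silently.
func validateToolUseRoles(msgs []Message) error {
	for i, m := range msgs {
		for _, b := range m.Content {
			if b.Type == "tool_use" && m.Role != "assistant" {
				return fmt.Errorf("message %d (role %q): %w", i, m.Role, errToolUseRole)
			}
		}
	}
	return nil
}

func main() {
	bad := []Message{{Role: "user", Content: []Block{{Type: "tool_use"}}}}
	fmt.Println(validateToolUseRoles(bad))
}
```

Wrapping the sentinel with `%w` lets callers test for this specific failure with `errors.Is` while still seeing which message was at fault.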
Summary
Closes the tool callbacks functional gap in MCP sampling support — a follow-up to #2815, addressing one of the remaining items from #2809.
When an MCP server includes a tools array in a sampling/createMessage request, the host now drives its model with those tools and returns any tool_use blocks back to the server as ToolUseContent. The server remains responsible for executing the tool and continuing the loop in a follow-up sampling request.
sequenceDiagram
    participant H as cagent
    participant S as MCP Server
    participant L as LLM
    activate H
    H->>+S: tools/call {name, arguments}
    note over S: needs LLM inference
    S->>+H: sampling/createMessage<br/>{messages, tools: [...]}
    H->>+L: chat completion
    L-->>-H: ToolUseContent<br/>stopReason: "toolUse"
    H-->>-S: CreateMessageResult<br/>{tool_use, stopReason: "toolUse"}
    note over S: executes tool locally
    S->>+H: sampling/createMessage<br/>{messages + tool_use + tool_result, tools: [...]}
    H->>+L: chat completion
    L-->>-H: TextContent<br/>stopReason: "endTurn"
    H-->>-S: CreateMessageResult<br/>{text, stopReason: "endTurn"}
    S-->>-H: tool result
    deactivate H
What's new
- SamplingWithToolsHandler type and SampleableWithTools interface: additive, parallel to the existing SamplingHandler / Sampleable. No breaking changes to the basic sampling path merged in #2815 (feat(mcp): add sampling/createMessage support).
- At Initialize, exactly one of the SDK's mutually exclusive ClientOptions.CreateMessage* fields is populated: prefer with-tools when registered, fall back to basic.
- Advertises sampling.tools so servers know the host can receive tool-enabled requests.
- In pkg/runtime/sampling.go:
  - Converts text, image/audio, tool_use → assistant ToolCalls, and tool_result → MessageRoleTool rows (parallel tool_results expand to multiple chat.Message rows).
  - Maps []*mcp.Tool → []tools.Tool with a no-op handler (the server, not the host, executes).
  - Calls model.CreateChatCompletionStream and aggregates streamed tool calls.
  - Returns Content with TextContent + ToolUseContent blocks; stopReason: "toolUse" when tool calls are present.
  - Enforces limits: maxSamplingTools=64, maxSamplingToolCalls=32.
- In e2e/sampling_test.go: mounts an in-process gomcp.NewServer on an httptest server via StreamableHTTPHandler. The server exposes one tool (ask_with_calculator) whose handler drives a real sampling-with-tools loop against the connecting cagent. The Gemini side is recorded once and replayed on subsequent runs, so the test runs offline in CI.
Out of scope (separate gaps from #2809)
Test plan